Degraded character recognition from old Kannada documents
Abstract
This paper addresses the preparation of a dataset of degraded Kannada characters and the robust recognition of such characters. The proposed algorithm extracts histogram of oriented gradients (HOG) features with block sizes 4x4 and 8x8, followed by principal component analysis (PCA) for feature reduction. Various classifiers are experimented with, and the fine K-nearest neighbor classifier performs best. The performance of the model is evaluated using the 5-fold cross validation method and the receiver operating characteristic curve. The devised dataset is of size 10440, having 156 classes (distinct characters). These characters are from 75 pages of old books that are not well preserved. A comparison with other features, like Haar wavelet and geometrical features, suggests that the proposed model is superior. It is observed that the PCA-reduced HOG features resulted in the best accuracy and acceptance rate, 98.6% and 97.9% for block sizes 4x4 and 8x8, respectively. The experimental results show that HOG feature extraction has a high recognition rate and that the system is robust even with extensively degraded characters.
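The pipeline described in the abstract (HOG features, PCA reduction, a nearest-neighbor classifier, 5-fold cross validation) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several things here are assumptions: the 4x4 and 8x8 sizes are treated as the pixel size of each histogram cell, the HOG is simplified (no overlapping block normalisation), "fine" K-nearest neighbor is read as k = 1, 10 PCA components are kept, and the input is synthetic stripe images rather than Kannada characters. All function names are hypothetical.

```python
import numpy as np

def hog_features(img, cell=8, bins=9):
    """Simplified HOG: one unsigned-orientation histogram per cell (assumed variant)."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0           # orientation in [0, 180)
    bin_idx = np.minimum((ang / (180.0 / bins)).astype(int), bins - 1)
    rows, cols = img.shape[0] // cell, img.shape[1] // cell
    feats = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            sl = (slice(r * cell, (r + 1) * cell), slice(c * cell, (c + 1) * cell))
            # Magnitude-weighted vote of every pixel in the cell
            feats[r, c] = np.bincount(bin_idx[sl].ravel(),
                                      weights=mag[sl].ravel(), minlength=bins)
    v = feats.ravel()
    return v / (np.linalg.norm(v) + 1e-9)                   # global L2 normalisation

def pca_fit(X, n_components):
    """Return the training mean and the top principal directions (via SVD)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    return (X - mean) @ components.T

def knn1_predict(X_train, y_train, X_test):
    """1-nearest-neighbour with squared Euclidean distance (assumed 'fine' KNN)."""
    d = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    return y_train[d.argmin(axis=1)]

def cross_validate(X, y, folds=5, n_components=10, seed=0):
    """Mean accuracy over k folds; PCA is fitted on the training split only."""
    order = np.random.default_rng(seed).permutation(len(X))
    accs = []
    for test_idx in np.array_split(order, folds):
        train_idx = np.setdiff1d(order, test_idx)
        mean, comps = pca_fit(X[train_idx], n_components)
        pred = knn1_predict(pca_transform(X[train_idx], mean, comps), y[train_idx],
                            pca_transform(X[test_idx], mean, comps))
        accs.append((pred == y[test_idx]).mean())
    return float(np.mean(accs))

def toy_image(label, rng, size=32):
    """Synthetic stand-in data: class 0 = horizontal stripes, class 1 = vertical."""
    stripes = np.sin(np.arange(size) * 2 * np.pi / 8)
    base = np.tile(stripes[:, None], (1, size)) if label == 0 \
        else np.tile(stripes[None, :], (size, 1))
    return base + 0.2 * rng.standard_normal((size, size))

rng = np.random.default_rng(0)
labels = np.array([i % 2 for i in range(60)])
images = [toy_image(lab, rng) for lab in labels]

results = {}
for cell in (4, 8):                                         # the two sizes named in the abstract
    X = np.array([hog_features(im, cell=cell) for im in images])
    results[cell] = cross_validate(X, labels)
    print(f"cell {cell}x{cell}: 5-fold accuracy on toy data = {results[cell]:.3f}")
```

Fitting PCA inside each fold keeps the held-out samples from influencing the projection, so the cross-validated accuracy is not optimistically biased. The accuracies printed for the toy stripes say nothing about the figures reported in the abstract.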
Similar resources
Enhancement of Degraded Historical Kannada Documents
Historical documents play a vital role in understanding our past and hence need to be preserved. Over the period, these documents tend to possess many variations like stains, strain, ink seepage, dust etc. Image enhancement techniques can be utilized to improve the quality of these images by removing noise and increasing contrast range. The proposed method mainly deals with enhancing the histor...
Feature Based Kannada Character Classification Method of Kannada Character Recognition
The Kannada language has more than 510 character sets to be recognized, including vowels, consonants, consonants modified by vowels, and consonant conjuncts, so the classification of characters becomes difficult. In this paper we present a new classification method for Kannada characters which can be used as a preliminary step for recognition. An analysis of Kannada characters was done and synt...
Degraded Character Recognition
The DCR application for Degraded Character Recognition was developed for the DEA (Diplôme d’Études Approfondies) thesis. The main objective is to recognize characters degraded by acquisition from a low-resolution camera. Several steps are needed and are detailed in this thesis, from the thresholding step, which converts a gray-level picture into a black and white one, to the post-correction s...
Optical Character Recognition from Degraded Document Images
Segmentation of the text from badly degraded document images is a very challenging task due to the high inter/intra variation between the document background and the foreground text of different types of document images. In this paper, a novel document image binarization technique is used to address the issues in degraded document images by using adaptive image contrast. The adaptive image...
Kannada Character Recognition System A Review
Intensive research has been done on optical character recognition (OCR), and a large number of articles have been published on this topic during the last few decades. Many commercial OCR systems are now available in the market, but most of these systems work for Roman, Chinese, Japanese and Arabic characters. There is not a sufficient number of works on Indian language character recognition especial...
Journal
Journal title: International Journal of Power Electronics and Drive Systems
Year: 2022
ISSN: 2722-2578, 2722-256X
DOI: https://doi.org/10.11591/ijece.v12i4.pp3632-3641